Preparation

Descriptive and Demographic Information

In total, we have data from 2303 participants. Of those, 2298 participants either finished the experiment or failed at the captcha stage (i.e., they did not abandon the study partway through). Of those, 2278 solved two or more captchas and proceeded to the roulette part. All following analyses are based on these 2278 participants.

The distribution of conditions is as follows (prop = proportion):

## # A tibble: 3 × 3
##   expt_cond          n  prop
##   <fct>          <int> <dbl>
## 1 No Message       755 0.331
## 2 Banner           796 0.349
## 3 Popup & Banner   727 0.319

Participants with and without experience in online roulette:

## # A tibble: 2 × 3
##   roulette     n  prop
##   <fct>    <int> <dbl>
## 1 Yes       1786 0.784
## 2 No         492 0.216

Some demographic information:

## # A tibble: 5 × 3
##   gender                     n    prop
##   <chr>                  <int>   <dbl>
## 1 Female                   965 0.424  
## 2 Male                    1289 0.566  
## 3 Non-binary                15 0.00658
## 4 None                       5 0.00219
## 5 Prefer Not to Disclose     4 0.00176
##           vars    n  mean    sd median trimmed   mad min   max range skew kurtosis   se
## age          1 2276 35.94 10.98     34   34.99 10.38  18  87.0  69.0 0.82     0.47 0.23
## bonus        2 2278  4.91  3.45      5    4.79  1.04   0  79.4  79.4 7.56   123.73 0.07
## bet_count    3 2278  5.97 12.94      3    3.39  4.45   0 198.0 198.0 7.00    70.84 0.27

Some conditional demographic information:

## # A tibble: 6 × 4
##   gender expt_cond          n  prop
##   <chr>  <fct>          <int> <dbl>
## 1 Female No Message       318 0.140
## 2 Female Banner           329 0.144
## 3 Female Popup & Banner   318 0.140
## 4 Male   No Message       427 0.187
## 5 Male   Banner           462 0.203
## 6 Male   Popup & Banner   400 0.176
## 
##  Descriptive statistics by group 
## expt_cond: No Message
##            vars   n  mean    sd median trimmed  mad min   max range skew kurtosis   se
## expt_cond*    1 755  1.00  0.00      1    1.00 0.00   1   1.0   0.0  NaN      NaN 0.00
## age           2 754 35.82 10.71     34   34.82 9.64  18  81.0  63.0 0.92     0.82 0.39
## bonus         3 755  4.98  4.18      5    4.79 1.48   0  79.4  79.4 9.52   147.49 0.15
## bet_count     4 755  6.22 13.10      3    3.57 4.45   0 187.0 187.0 6.64    65.17 0.48
## ---------------------------------------------------------------------------------- 
## expt_cond: Banner
##            vars   n  mean    sd median trimmed   mad min   max range skew kurtosis   se
## expt_cond*    1 796  2.00  0.00      2    2.00  0.00   2   2.0   0.0  NaN      NaN 0.00
## age           2 795 36.20 11.12     34   35.28 10.38  19  87.0  68.0 0.79     0.31 0.39
## bonus         3 796  4.95  3.43      5    4.81  0.74   0  46.6  46.6 5.20    50.70 0.12
## bet_count     4 796  5.57 10.36      3    3.33  4.45   0 104.0 104.0 4.67    28.74 0.37
## ---------------------------------------------------------------------------------- 
## expt_cond: Popup & Banner
##            vars   n  mean    sd median trimmed   mad min max range skew kurtosis   se
## expt_cond*    1 727  3.00  0.00      3    3.00  0.00   3   3     0  NaN      NaN 0.00
## age           2 727 35.79 11.10     34   34.85 10.38  18  78    60 0.77     0.29 0.41
## bonus         3 727  4.80  2.52      5    4.78  0.89   0  20    20 1.28     7.20 0.09
## bet_count     4 727  6.14 15.15      3    3.30  4.45   0 198   198 7.56    72.68 0.56

Now let’s take a look at the PGSI scores:

## # A tibble: 1 × 2
##   pgsi_mean pgsi_sd
##       <dbl>   <dbl>
## 1      2.49    3.76
## # A tibble: 3 × 3
##   expt_cond      pgsi_mean pgsi_sd
##   <fct>              <dbl>   <dbl>
## 1 No Message          2.51    3.90
## 2 Banner              2.45    3.76
## 3 Popup & Banner      2.52    3.61

Of the total of 2278 participants, 579 participants did not bet at all. We can take a look at the bonus as a function of whether or not participants bet:

## # A tibble: 2 × 7
##   bet_at_all  mean median    sd     n   min   max
##   <lgl>      <dbl>  <dbl> <dbl> <int> <dbl> <dbl>
## 1 FALSE       5         5  0      579     5   5  
## 2 TRUE        4.89      5  4.00  1699     0  79.4
## # A tibble: 6 × 8
## # Groups:   expt_cond [3]
##   expt_cond      bet_at_all  mean median    sd     n   min   max
##   <fct>          <lgl>      <dbl>  <dbl> <dbl> <int> <dbl> <dbl>
## 1 No Message     FALSE       5       5    0      186     5   5  
## 2 No Message     TRUE        4.98    5    4.81   569     0  79.4
## 3 Banner         FALSE       5       5    0      212     5   5  
## 4 Banner         TRUE        4.94    5    4.00   584     0  46.6
## 5 Popup & Banner FALSE       5       5    0      181     5   5  
## 6 Popup & Banner TRUE        4.74    4.9  2.90   546     0  20

Note that the expected loss when betting in our roulette task is 1/37 of the amount bet, so participants are expected to retain 36/37 of what they bet:

\[£5 \times \frac{36}{37} \approx £4.86\]
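The 36/37 figure can be illustrated with a short sketch. It assumes a single-zero wheel with 37 pockets and standard payouts (35:1 for a single number, 1:1 for red/black); the text only states the expected loss of 1/37, so the payout structure here is an assumption made for illustration.

```r
# Expected amount retained when betting the full 5 pound endowment once.
# Assumption: single-zero wheel (37 pockets) with standard payouts.
stake   <- 5
pockets <- 37

# Single-number bet: wins on 1 pocket, returns 36 times the stake (35:1 plus stake)
ev_straight <- stake * (1 / pockets) * 36

# Even-money bet (e.g., red/black): wins on 18 pockets, returns twice the stake
ev_even <- stake * (18 / pockets) * 2

round(c(straight = ev_straight, even_money = ev_even), 2)
```

Under these assumptions both bet types return £5 × 36/37 in expectation, which is why the expected retained amount does not depend on the type of bet placed.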

Next let’s take a look at our main DV, proportion of money bet, which is defined as follows: \[\texttt{prop_bet} = \frac{\texttt{amount}}{5 + \texttt{total_win}}\]

## # A tibble: 3 × 3
##   expt_cond      prop_bet_mean prop_bet_sd
##   <fct>                  <dbl>       <dbl>
## 1 No Message             0.344       0.322
## 2 Banner                 0.323       0.321
## 3 Popup & Banner         0.316       0.319

Distribution of DV

Our DV clearly does not look normally distributed.

## # A tibble: 1 × 3
##   gamble_at_all gamble_everything proportion_bet_rest
##           <dbl>             <dbl>               <dbl>
## 1         0.746             0.122               0.361
## # A tibble: 1 × 3
##   no_gamble gamble_at_all gamble_everything
##       <int>         <int>             <int>
## 1       579          1699               208
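The proportions in the first table follow directly from the counts in the second. The sketch below recomputes them and combines them with the reported mean of the remaining bets (0.361, copied from the table rather than recomputed) into an implied overall mean proportion bet. Variable names are illustrative.

```r
# Counts taken from the table above
n_total           <- 2278
no_gamble         <- 579
gamble_everything <- 208
gamble_at_all     <- n_total - no_gamble          # 1699

p_gamble     <- gamble_at_all / n_total           # probability to gamble at all
p_everything <- gamble_everything / gamble_at_all # conditional on gambling

# Mean proportion bet among those betting neither nothing nor everything,
# copied from the table (not recomputed here)
prop_rest <- 0.361

# Implied overall mean proportion bet
overall <- p_gamble * (p_everything + (1 - p_everything) * prop_rest)

round(c(p_gamble = p_gamble, p_everything = p_everything, overall = overall), 3)
```

Note that `gamble_everything` is a proportion of those who gamble, not of all participants. The implied overall mean of roughly 0.33 lies within the range of the per-condition means of the DV reported earlier (0.316 to 0.344).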

Binomial confidence or credible intervals for the probability to gamble at all:

##           method    x    n      mean     lower     upper
## 1  agresti-coull 1699 2278 0.7458297 0.7275419 0.7632898
## 2     asymptotic 1699 2278 0.7458297 0.7279503 0.7637091
## 3          bayes 1699 2278 0.7457218 0.7277920 0.7635307
## 4        cloglog 1699 2278 0.7458297 0.7274299 0.7631969
## 5          exact 1699 2278 0.7458297 0.7274217 0.7636032
## 6          logit 1699 2278 0.7458297 0.7275397 0.7632913
## 7         probit 1699 2278 0.7458297 0.7276259 0.7633743
## 8        profile 1699 2278 0.7458297 0.7276804 0.7634263
## 9            lrt 1699 2278 0.7458297 0.7276681 0.7634233
## 10     prop.test 1699 2278 0.7458297 0.7273225 0.7634990
## 11        wilson 1699 2278 0.7458297 0.7275467 0.7632850
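To make the table less of a black box, three of the intervals can be recomputed with base R alone. This is a sketch for illustration, not the code that produced the table; the `exact` interval is taken from `binom.test()`, which reports a Clopper-Pearson interval.

```r
x <- 1699  # participants who gambled at all
n <- 2278  # all participants
p_hat <- x / n
z <- qnorm(0.975)

# Asymptotic (Wald) interval
wald <- p_hat + c(-1, 1) * z * sqrt(p_hat * (1 - p_hat) / n)

# Wilson score interval
centre <- (p_hat + z^2 / (2 * n)) / (1 + z^2 / n)
half   <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
wilson <- centre + c(-1, 1) * half

# Exact (Clopper-Pearson) interval
exact <- as.numeric(binom.test(x, n)$conf.int)

round(rbind(wald = wald, wilson = wilson, exact = exact), 4)
```

With n this large, all methods agree to about two decimal places, so the choice of interval method has no practical consequence here.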

Distribution per condition:

Hypothesis 1: Zero-One Inflated Beta Regression on Proportion Bet

We use a custom parameterization of a zero-one-inflated beta-regression model (see also here). The likelihood of the model is given by:

\[\begin{align} f(y) &= (1 - g) & & \text{if } y = 0 \\ f(y) &= g \times e & & \text{if } y = 1 \\ f(y) &= g \times (1 - e) \times \text{Beta}(a,b) & & \text{if } y \notin \{0, 1\} \\ a &= \mu \times \phi \\ b &= (1-\mu) \times \phi \end{align}\]

Here, \(g\) is the probability to gamble at all (zipp), so that \(1 - g\) is the zero-inflation probability; \(e\) is the conditional one-inflation probability (coi), that is, the conditional probability to gamble everything (i.e., to have a value of one, given that one gambles); \(\mu\) is the mean of the beta distribution (Intercept); and \(\phi\) is the precision of the beta distribution (phi). As we use Stan for modelling, all parameters are modelled on the real line and need appropriate link functions. For \(\phi\) the link is log (inverse is exp()); for all other parameters it is logit (inverse is plogis()).
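The likelihood above can be written down as a small R function. This is a minimal sketch in base R and not the Stan code of the fitted model; the function name `dzoib` and the example values on the linear-predictor scale are chosen here for illustration only.

```r
# Density / mass function of the zero-one-inflated beta likelihood above.
# g: probability to gamble at all; e: conditional probability to bet everything;
# mu: mean of the beta part; phi: precision of the beta part.
dzoib <- function(y, g, e, mu, phi) {
  a <- mu * phi
  b <- (1 - mu) * phi
  ifelse(
    y == 0, 1 - g,
    ifelse(y == 1, g * e, g * (1 - e) * dbeta(y, a, b))
  )
}

# Illustrative values on the linear-predictor scale, mapped back with the
# inverse link functions (plogis for logit, exp for log)
g   <- plogis(1.1)
e   <- plogis(-2)
mu  <- plogis(-0.5)
phi <- exp(1.2)

p_zero <- dzoib(0, g, e, mu, phi)  # point mass at 0
p_one  <- dzoib(1, g, e, mu, phi)  # point mass at 1
p_mid  <- integrate(function(y) dzoib(y, g, e, mu, phi), 0, 1)$value

# The two point masses and the continuous part together sum to 1
c(p_zero = p_zero, p_one = p_one, p_mid = p_mid, total = p_zero + p_one + p_mid)
```

The check at the end confirms that the three branches define a proper distribution: the continuous part integrates to \(g \times (1 - e)\), which together with the two point masses gives one.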

We fit this model with experimental condition as a factor on the three main model parameters (i.e., only the precision parameter is fixed across conditions). The following table provides an overview of the model and all model parameters and shows good convergence.

##  Family: zoib2 
##   Links: mu = logit; phi = log; zipp = logit; coi = logit 
## Formula: prop_bet ~ 0 + Intercept + expt_cond 
##          phi ~ 1
##          zipp ~ 0 + Intercept + expt_cond
##          coi ~ 0 + Intercept + expt_cond
##    Data: duse (Number of observations: 2278) 
##   Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
##          total post-warmup draws = 1e+05
## 
## Population-Level Effects: 
##                            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## phi_Intercept                  1.18      0.03     1.11     1.24 1.00   108609    75461
## Intercept                     -0.45      0.04    -0.53    -0.36 1.00    70546    71228
## expt_condBanner               -0.10      0.06    -0.22     0.02 1.00    78123    76045
## expt_condPopup&Banner         -0.11      0.06    -0.23     0.01 1.00    79997    76023
## zipp_Intercept                 1.12      0.08     0.96     1.29 1.00    72578    70218
## zipp_expt_condBanner          -0.11      0.12    -0.33     0.12 1.00    80464    74831
## zipp_expt_condPopup&Banner    -0.01      0.12    -0.25     0.22 1.00    82014    76326
## coi_Intercept                 -1.99      0.13    -2.25    -1.74 1.00    70881    68608
## coi_expt_condBanner            0.07      0.18    -0.29     0.42 1.00    78531    71854
## coi_expt_condPopup&Banner     -0.04      0.19    -0.41     0.32 1.00    80965    75928
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

As a visual convergence check, we plot the density and trace plots for the four intercept parameters representing the no message (control) condition or the overall mean (for phi).

The model does not have any obvious problems, even without priors for the condition-specific effects.

Posterior Predictive Checks

As expected the synthetic data generated from the model looks a lot like the actual data. This suggests that the model is adequate for the data.

Proportion Bet

Our hypothesis is about the proportion bet, \(Pr_{bet}\), which is given by:

\[Pr_{bet} = (g * e) + (g * (1-e) * \mu)\]

The following tables show the resulting \(Pr_{bet}\) posterior distributions across conditions and their differences from the no message condition.

## # A tibble: 3 × 7
##   expt_cond      prop_bet .lower .upper .width .point .interval
##   <fct>             <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 No Message        0.349  0.327  0.373   0.95 mean   qi       
## 2 Banner            0.328  0.306  0.351   0.95 mean   qi       
## 3 Popup & Banner    0.329  0.306  0.352   0.95 mean   qi
## # A tibble: 2 × 7
##   expt_cond                   prop_bet  .lower .upper .width .point .interval
##   <chr>                          <dbl>   <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 Banner - No Message          -0.0212 -0.0534 0.0111   0.95 mean   qi       
## 2 Popup & Banner - No Message  -0.0203 -0.0526 0.0123   0.95 mean   qi
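As a rough plausibility check of the \(Pr_{bet}\) formula, the posterior means from the model summary can be plugged in directly. This is only a sketch: it uses point estimates rather than the full set of posterior draws on which the tables are based, so small discrepancies are expected.

```r
# Posterior means on the logit scale, taken from the model summary
# (intercept = No Message; other conditions = intercept + condition effect)
cond     <- c("No Message", "Banner", "Popup & Banner")
mu_logit <- c(-0.45, -0.45 - 0.10, -0.45 - 0.11)   # Intercept, expt_cond*
g_logit  <- c( 1.12,  1.12 - 0.11,  1.12 - 0.01)   # zipp_*
e_logit  <- c(-1.99, -1.99 + 0.07, -1.99 - 0.04)   # coi_*

# Back-transform with the inverse logit
mu <- plogis(mu_logit)
g  <- plogis(g_logit)
e  <- plogis(e_logit)

# Pr_bet = (g * e) + (g * (1 - e) * mu)
pr_bet <- setNames(g * e + g * (1 - e) * mu, cond)
round(pr_bet, 3)
```

The plug-in values land within rounding of the posterior means in the table (0.349, 0.328, 0.329), which suggests the formula and the parameter labels are being read correctly.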

Individual ZOIBR Parameters

Mu:

##  expt_cond      response lower.HPD upper.HPD
##  No Message        0.390     0.369     0.410
##  Banner            0.366     0.347     0.386
##  Popup & Banner    0.364     0.344     0.384
## 
## Point estimate displayed: median 
## Results are back-transformed from the logit scale 
## HPD interval probability: 0.95

g:

##  expt_cond      response lower.HPD upper.HPD
##  No Message        0.754     0.723     0.784
##  Banner            0.734     0.703     0.764
##  Popup & Banner    0.751     0.719     0.781
## 
## Point estimate displayed: median 
## Results are back-transformed from the logit scale 
## HPD interval probability: 0.95

e:

##  expt_cond      response lower.HPD upper.HPD
##  No Message        0.121    0.0942     0.148
##  Banner            0.128    0.1017     0.156
##  Popup & Banner    0.117    0.0906     0.144
## 
## Point estimate displayed: median 
## Results are back-transformed from the logit scale 
## HPD interval probability: 0.95

Hypothesis 1: Control Analysis with PGSI

Variant 1: PGSI Entered as Main Effect

##  Family: zoib2 
##   Links: mu = logit; phi = log; zipp = logit; coi = logit 
## Formula: prop_bet ~ expt_cond + pgsi_c 
##          phi ~ 1
##          zipp ~ expt_cond + pgsi_c
##          coi ~ expt_cond + pgsi_c
##    Data: duse (Number of observations: 2278) 
##   Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
##          total post-warmup draws = 1e+05
## 
## Population-Level Effects: 
##                            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept                     -0.45      0.04    -0.54    -0.37 1.00   147289    81390
## phi_Intercept                  1.18      0.03     1.12     1.25 1.00   192152    79733
## zipp_Intercept                 1.14      0.08     0.97     1.30 1.00   137836    81842
## coi_Intercept                 -2.04      0.13    -2.30    -1.79 1.00   149622    78426
## expt_condBanner               -0.10      0.06    -0.22     0.02 1.00   144832    85026
## expt_condPopup&Banner         -0.11      0.06    -0.23     0.01 1.00   147100    84862
## pgsi_c                         0.02      0.01     0.01     0.03 1.00   243969    74720
## zipp_expt_condBanner          -0.10      0.12    -0.33     0.13 1.00   137959    84597
## zipp_expt_condPopup&Banner    -0.02      0.12    -0.26     0.22 1.00   133246    87221
## zipp_pgsi_c                    0.07      0.02     0.04     0.10 1.00   195951    77443
## coi_expt_condBanner            0.06      0.18    -0.29     0.42 1.00   150039    85441
## coi_expt_condPopup&Banner     -0.04      0.19    -0.41     0.32 1.00   148403    86039
## coi_pgsi_c                     0.09      0.02     0.06     0.12 1.00   187549    80180
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

As a visual convergence check, we plot the density and trace plots for the four intercept parameters representing the no message condition or the overall mean (for phi).

Let’s then take a look at the difference distribution of proportion bet after adjusting for PGSI:

## # A tibble: 2 × 7
##   expt_cond                   prop_bet  .lower .upper .width .point .interval
##   <chr>                          <dbl>   <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 Banner - No Message          -0.0210 -0.0527 0.0110   0.95 mean   qi       
## 2 Popup & Banner - No Message  -0.0205 -0.0527 0.0120   0.95 mean   qi

Variant 2: PGSI Entered as Interaction

##  Family: zoib2 
##   Links: mu = logit; phi = log; zipp = logit; coi = logit 
## Formula: prop_bet ~ expt_cond * pgsi_c 
##          phi ~ 1
##          zipp ~ expt_cond * pgsi_c
##          coi ~ expt_cond * pgsi_c
##    Data: duse (Number of observations: 2278) 
##   Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
##          total post-warmup draws = 1e+05
## 
## Population-Level Effects: 
##                                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept                            -0.45      0.04    -0.54    -0.37 1.00   118011    78112
## phi_Intercept                         1.19      0.03     1.12     1.25 1.00   131837    77887
## zipp_Intercept                        1.13      0.09     0.96     1.30 1.00   107774    76676
## coi_Intercept                        -2.02      0.13    -2.29    -1.77 1.00   110396    74054
## expt_condBanner                      -0.10      0.06    -0.22     0.02 1.00   114287    78534
## expt_condPopup&Banner                -0.11      0.06    -0.23     0.01 1.00   116914    81953
## pgsi_c                               -0.00      0.01    -0.03     0.02 1.00    87041    73534
## expt_condBanner:pgsi_c                0.02      0.02    -0.01     0.05 1.00    96283    79737
## expt_condPopup&Banner:pgsi_c          0.05      0.02     0.02     0.08 1.00    96405    78454
## zipp_expt_condBanner                 -0.08      0.12    -0.31     0.15 1.00   110693    79469
## zipp_expt_condPopup&Banner            0.00      0.12    -0.24     0.24 1.00   111341    82515
## zipp_pgsi_c                           0.04      0.02    -0.01     0.09 1.00    86850    75623
## zipp_expt_condBanner:pgsi_c           0.05      0.04    -0.03     0.12 1.00    94416    79556
## zipp_expt_condPopup&Banner:pgsi_c     0.05      0.04    -0.03     0.12 1.00    96439    79162
## coi_expt_condBanner                   0.03      0.19    -0.33     0.40 1.00   113047    82707
## coi_expt_condPopup&Banner            -0.09      0.19    -0.47     0.29 1.00   111429    80151
## coi_pgsi_c                            0.06      0.03     0.01     0.12 1.00    84501    69538
## coi_expt_condBanner:pgsi_c            0.03      0.04    -0.05     0.10 1.00    91648    76296
## coi_expt_condPopup&Banner:pgsi_c      0.04      0.04    -0.04     0.12 1.00    94486    80426
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

As a visual convergence check, we plot the density and trace plots for the four intercept parameters representing the no message condition or the overall mean (for phi).

Let’s then take a look at the difference distribution of proportion bet after adjusting for PGSI:

## # A tibble: 2 × 7
##   expt_cond                   prop_bet  .lower .upper .width .point .interval
##   <chr>                          <dbl>   <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 Banner - No Message          -0.0209 -0.0531 0.0112   0.95 mean   qi       
## 2 Popup & Banner - No Message  -0.0210 -0.0537 0.0116   0.95 mean   qi

Hypothesis 2: Clicks (Yes/No) on the GamCare information page

Let’s begin with some simple descriptive statistics of the clicks on the GamCare page.

## # A tibble: 3 × 5
##   expt_cond      proportion    sd success     n
##   <fct>               <dbl> <dbl>   <int> <int>
## 1 No Message         0.0278 0.165      21   755
## 2 Banner             0.0289 0.168      23   796
## 3 Popup & Banner     0.0248 0.155      18   727

Model shows no obvious convergence problems:

##  Family: bernoulli 
##   Links: mu = logit 
## Formula: gamcare_click ~ expt_cond 
##    Data: duse (Number of observations: 2278) 
##   Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
##          total post-warmup draws = 1e+05
## 
## Population-Level Effects: 
##                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept                -3.57      0.22    -4.03    -3.16 1.00    62390    57127
## expt_condBanner           0.04      0.31    -0.56     0.65 1.00    67558    66366
## expt_condPopup&Banner    -0.12      0.33    -0.77     0.52 1.00    65300    67457
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Let’s take a look at the predicted probabilities and their differences from the no message condition:

##  expt_cond      response lower.HPD upper.HPD
##  No Message       0.0276    0.0167    0.0399
##  Banner           0.0287    0.0181    0.0411
##  Popup & Banner   0.0245    0.0143    0.0365
## 
## Point estimate displayed: median 
## Results are back-transformed from the logit scale 
## HPD interval probability: 0.95
## # A tibble: 2 × 7
##   expt_cond                       prob  .lower .upper .width .point .interval
##   <chr>                          <dbl>   <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 Banner - No Message          0.00110 -0.0155 0.0178   0.95 mean   qi       
## 2 Popup & Banner - No Message -0.00304 -0.0193 0.0133   0.95 mean   qi
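The predicted probabilities can be related back to the raw click rates with a few lines of base R. The sketch plugs the posterior means of the logit-scale coefficients into the inverse logit; the table above reports posterior medians with HPD intervals, so the numbers agree only approximately.

```r
cond   <- c("No Message", "Banner", "Popup & Banner")
clicks <- c(21, 23, 18)     # participants who clicked on the GamCare page
n      <- c(755, 796, 727)  # participants per condition

# Observed click proportions
observed <- setNames(clicks / n, cond)

# Plug-in predictions: inverse logit of intercept (+ condition effect)
logit_est <- c(-3.57, -3.57 + 0.04, -3.57 - 0.12)
predicted <- setNames(plogis(logit_est), cond)

round(rbind(observed = observed, predicted = predicted), 4)
```

Observed and predicted click rates are all between roughly 2.4% and 2.9%, and differ from each other by less than a tenth of a percentage point.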

Now the results figure:

Hypothesis 3a: Mean Speed Of Play

There are a total of 13590 bets. If we remove the first bet of each participant, 11891 bets remain. Of those, 29 (0.24%) bets took longer than 120 seconds. Following our pre-registration, we remove these betting times from the analysis.
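The bookkeeping behind these numbers can be retraced with simple arithmetic. The sketch assumes that the number of first bets equals the 1699 participants who placed at least one bet, as reported in the descriptive section.

```r
n_bets    <- 13590  # all bets
n_bettors <- 1699   # participants with at least one bet (one first bet each)

n_after_first <- n_bets - n_bettors          # bets left after dropping first bets
n_slow        <- 29                          # bets slower than 120 seconds
pct_slow      <- 100 * n_slow / n_after_first
n_analysed    <- n_after_first - n_slow      # betting times entering the analysis

c(n_after_first = n_after_first, pct_slow = round(pct_slow, 2), n_analysed = n_analysed)
```

This leaves 11862 betting times for the analysis.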

The following histogram shows the distribution of betting times.

We can also take a look at some descriptive statistics of the distribution:

## # A tibble: 1 × 3
##   time_mean time_median time_sd
##       <dbl>       <dbl>   <dbl>
## 1      9.69        5.85    11.2
## # A tibble: 3 × 4
##   expt_cond      time_mean time_median time_sd
##   <fct>              <dbl>       <dbl>   <dbl>
## 1 No Message          9.30        5.90    9.87
## 2 Banner              9.27        5.39   11.5 
## 3 Popup & Banner     10.5         6.29   12.1

We analyse the betting times shown above using a shifted-lognormal model with by-participant random intercepts for the log-mean, allowing both log-mean and log-SD to vary across message conditions. The following shows the model summary (which shows no obvious convergence problems).

##  Family: shifted_lognormal 
##   Links: mu = identity; sigma = log; ndt = identity 
## Formula: time ~ expt_cond + (1 | ppt_id) 
##          sigma ~ expt_cond
##    Data: times_use2 (Number of observations: 11862) 
##   Draws: 10 chains, each with iter = 2001; warmup = 334; thin = 3;
##          total post-warmup draws = 5557
## 
## Group-Level Effects: 
## ~ppt_id (Number of levels: 1418) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.68      0.02     0.65     0.72 1.00     8005    11560
## 
## Population-Level Effects: 
##                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept                       1.89      0.04     1.82     1.96 1.00     6868    11053
## sigma_Intercept                -0.24      0.01    -0.26    -0.21 1.00    16715    16012
## expt_condBanner                -0.05      0.05    -0.16     0.05 1.00     7741    11117
## expt_condPopup&Banner           0.03      0.05    -0.07     0.13 1.00     6400    10288
## sigma_expt_condBanner           0.05      0.02     0.02     0.09 1.00    16442    15747
## sigma_expt_condPopup&Banner     0.03      0.02    -0.00     0.06 1.00    16317    15688
## 
## Family Specific Parameters: 
##     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## ndt     0.64      0.01     0.61     0.66 1.00    16631    15956
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

The summary table shows that one of the message-specific parameters for the log-SD provides evidence for a difference between the no message condition and a message condition (the 95% CI for sigma_expt_condBanner does not include 0).

The model shows no obvious convergence problems:

The model is also able to adequately reproduce the shape of the observed data.

Our hypothesis is about the mean betting time, which we need to calculate from the model parameters log-mean m and log-SD sigma as mean = exp(m + sigma^2/2).

The following table shows the predicted mean betting times, which are similar to the observed ones and reproduce the ordering of the condition means.

## # A tibble: 3 × 7
##   expt_cond       mean .lower .upper .width .point .interval
##   <fct>          <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 No Message      9.02   8.38   9.70   0.95 mean   qi       
## 2 Banner          8.87   8.23   9.55   0.95 mean   qi       
## 3 Popup & Banner  9.48   8.77  10.2    0.95 mean   qi
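The mapping from model parameters to mean betting time can be illustrated by plugging the posterior means from the model summary into the formula given above, mean = exp(m + sigma^2/2). This sketch applies the formula exactly as stated (without adding any further terms) and uses point estimates instead of posterior draws, so it only approximates the table.

```r
cond <- c("No Message", "Banner", "Popup & Banner")

# Log-mean m: Intercept (+ condition effect)
m <- c(1.89, 1.89 - 0.05, 1.89 + 0.03)

# Log-SD sigma: modelled with a log link, so back-transform with exp()
sigma <- exp(c(-0.24, -0.24 + 0.05, -0.24 + 0.03))

# Mean of a lognormal distribution with parameters m and sigma
pred_mean <- setNames(exp(m + sigma^2 / 2), cond)
round(pred_mean, 2)
```

The plug-in values are within about 0.01 seconds of the posterior means in the table (9.02, 8.87, 9.48). Note that both the log-mean and the log-SD feed into the mean, so a condition difference in sigma alone can shift the predicted mean betting time.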

We can also take a look at the differences from the no message condition:

## # A tibble: 2 × 7
##   expt_cond                     mean .lower .upper .width .point .interval
##   <chr>                        <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 Banner - No Message         -0.156 -1.09   0.783   0.95 mean   qi       
## 2 Popup & Banner - No Message  0.454 -0.529  1.42    0.95 mean   qi

Now the results figure:

Hypothesis 3b: Total Number of Spins

The following histogram shows the distribution of the number of spins.

Some descriptive statistics of the non-zero bet counts:

## # A tibble: 3 × 4
##   expt_cond      bet_count_mean bet_count_median bet_count_sd
##   <fct>                   <dbl>            <dbl>        <dbl>
## 1 No Message               8.25                4         14.5
## 2 Banner                   7.59                4         11.4
## 3 Popup & Banner           8.18                4         17.0

Following the preregistration, we analyse the distribution after excluding all observations with 0 spins. We then use a negative binomial model to describe the data.

This model shows no obvious convergence problems.

##  Family: negbinomial 
##   Links: mu = log; shape = identity 
## Formula: bet_count | trunc(lb = 1) ~ expt_cond 
##    Data: part_nozero (Number of observations: 1699) 
##   Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
##          total post-warmup draws = 1e+05
## 
## Population-Level Effects: 
##                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept                 1.43      0.11     1.19     1.63 1.00    49824    48954
## expt_condBanner          -0.11      0.10    -0.30     0.09 1.00    58818    60915
## expt_condPopup&Banner    -0.01      0.10    -0.21     0.19 1.00    60616    59596
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape     0.25      0.03     0.18     0.31 1.00    48230    46684
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

The data seems to be well described by the model.

When we zoom in (i.e., ignore data points above 50 for the plot), we can see that the real and synthetic data match quite well.

Then let’s take a look at the predicted number of spins. To do so it is important to note that the mean parameter gives the mean of the non-truncated negative binomial distribution and not the mean of the truncated distribution.

However, we can derive the mean of the truncated distribution from first principles. More specifically, for any random variable \(X\) truncated such that \(X > y\), with density function \(f(x)\) and cumulative distribution function \(F(x)\) of the non-truncated distribution, its expectation \(E(X|X > y)\) (or mean) is given by (see e.g., Wikipedia) \[ E(X|X > y) = \frac{\int_y^\infty x f(x) dx}{1 - F(y)}. \] In words, this formula says that the expectation of the truncated random variable is given by the contribution of the retained region \(X > y\) to the expectation of the non-truncated random variable (the numerator) divided by the probability of that region.

The difficulty in calculating this expectation is of course the integral in the numerator. However, because the negative binomial distribution is a discrete probability distribution and we truncate it such that \(X > 0\) this calculation is trivial in the present case.

Note that in the case of a discrete variable, the expectation of the truncated variable becomes \[ E(X|X > y) = \frac{\sum_{x=y + 1}^\infty x\, f(x)}{1 - F(y)}. \] Given this formulation it is easy to see that the expectation of the full (i.e., non-truncated) negative binomial distribution, \(E(X) = \sum_{x=0}^\infty x\, f(x)\), is equal to the numerator of the truncated expectation if \(y = 0\). The reason is that the \(x = 0\) term of the sum in \(E(X)\) is zero. Hence, for the negative binomial truncated at zero, the expectation is given by \[ E(X|X > 0) = \frac{E(X)}{1 - F(0)}. \] Using this formula, we can now calculate the predicted (or estimated) mean number of spins:

## # A tibble: 3 × 7
##   expt_cond      trunc_mean .lower .upper .width .point .interval
##   <fct>               <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 No Message           8.28   7.44   9.23   0.95 mean   qi       
## 2 Banner               7.62   6.87   8.47   0.95 mean   qi       
## 3 Popup & Banner       8.20   7.36   9.17   0.95 mean   qi
## # A tibble: 2 × 7
##   expt_cond                   trunc_mean .lower .upper .width .point .interval
##   <chr>                            <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
## 1 Banner - No Message            -0.661   -1.87  0.539   0.95 mean   qi       
## 2 Popup & Banner - No Message    -0.0736  -1.34  1.20    0.95 mean   qi
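The identity \(E(X|X > 0) = E(X) / (1 - F(0))\) can be checked numerically in base R. The sketch below plugs in the posterior means from the model summary (so it only approximates the posterior means in the table) and compares the closed form against a brute-force sum over the truncated distribution. It assumes that the `mu` and `shape` parameters of the model correspond to the `mu` and `size` arguments of `dnbinom()`.

```r
cond  <- c("No Message", "Banner", "Popup & Banner")
shape <- 0.25
# Mean of the non-truncated distribution: log link, Intercept (+ condition effect)
mu <- setNames(exp(c(1.43, 1.43 - 0.11, 1.43 - 0.01)), cond)

# Closed form: E(X) / (1 - F(0)), where F(0) = P(X = 0)
p_zero     <- dnbinom(0, size = shape, mu = mu)
trunc_mean <- mu / (1 - p_zero)

# Brute force: sum x * f(x) over x >= 1, divided by P(X > 0).
# The upper limit of 5000 is far into the negligible tail.
x <- 1:5000
brute <- sapply(mu, function(m) {
  sum(x * dnbinom(x, size = shape, mu = m)) / (1 - dnbinom(0, size = shape, mu = m))
})

round(rbind(closed_form = trunc_mean, brute_force = brute), 2)
```

Both routes give the same values, which are close to, but slightly below, the posterior means in the table. A gap of this size is expected because the transformation is non-linear, so plugging in posterior means is not the same as averaging the transformed posterior draws.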

Now the results figure:

System

## R version 4.1.3 (2022-03-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
##  [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
##  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
## [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] binom_1.1-1      emmeans_1.7.3    tidybayes_3.0.2  brms_2.16.3      Rcpp_1.0.8.3     forcats_0.5.1   
##  [7] stringr_1.4.0    dplyr_1.0.8      purrr_0.3.4      readr_2.1.2      tidyr_1.2.0      tibble_3.1.6    
## [13] ggplot2_3.3.5    tidyverse_1.3.1  checkpoint_1.0.2
## 
## loaded via a namespace (and not attached):
##   [1] readxl_1.4.0         backports_1.4.1      plyr_1.8.7           igraph_1.3.0         svUnit_1.0.6        
##   [6] crosstalk_1.2.0      rstantools_2.2.0     inline_0.3.19        digest_0.6.29        htmltools_0.5.2     
##  [11] fansi_1.0.3          magrittr_2.0.3       checkmate_2.0.0      tzdb_0.3.0           modelr_0.1.8        
##  [16] RcppParallel_5.1.5   matrixStats_0.61.0   vroom_1.5.7          xts_0.12.1           prettyunits_1.1.1   
##  [21] colorspace_2.0-3     rvest_1.0.2          ggdist_3.1.1         haven_2.4.3          xfun_0.30           
##  [26] callr_3.7.0          crayon_1.5.1         jsonlite_1.8.0       zoo_1.8-9            glue_1.6.2          
##  [31] gtable_0.3.0         distributional_0.3.0 pkgbuild_1.3.1       rstan_2.21.3         abind_1.4-5         
##  [36] scales_1.1.1         mvtnorm_1.1-3        DBI_1.1.2            miniUI_0.1.1.1       xtable_1.8-4        
##  [41] tmvnsim_1.0-2        bit_4.0.4            stats4_4.1.3         StanHeaders_2.21.0-7 DT_0.22             
##  [46] htmlwidgets_1.5.4    httr_1.4.2           threejs_0.3.3        arrayhelpers_1.1-0   posterior_1.2.1     
##  [51] ellipsis_0.3.2       pkgconfig_2.0.3      loo_2.5.1            farver_2.1.0         sass_0.4.1          
##  [56] dbplyr_2.1.1         utf8_1.2.2           tidyselect_1.1.2     labeling_0.4.2       rlang_1.0.2         
##  [61] reshape2_1.4.4       later_1.3.0          munsell_0.5.0        cellranger_1.1.0     tools_4.1.3         
##  [66] cli_3.2.0            generics_0.1.2       broom_0.7.12         ggridges_0.5.3       evaluate_0.15       
##  [71] fastmap_1.1.0        yaml_2.3.5           processx_3.5.3       knitr_1.38           bit64_4.0.5         
##  [76] fs_1.5.2             nlme_3.1-155         mime_0.12            xml2_1.3.3           compiler_4.1.3      
##  [81] bayesplot_1.9.0      shinythemes_1.2.0    rstudioapi_0.13      reprex_2.0.1         bslib_0.3.1         
##  [86] stringi_1.7.6        highr_0.9            ps_1.6.0             Brobdingnag_1.2-7    lattice_0.20-45     
##  [91] Matrix_1.4-0         psych_2.2.3          markdown_1.1         shinyjs_2.1.0        tensorA_0.36.2      
##  [96] vctrs_0.4.0          pillar_1.7.0         lifecycle_1.0.1      jquerylib_0.1.4      bridgesampling_1.1-2
## [101] estimability_1.3     cowplot_1.1.1        httpuv_1.6.5         R6_2.5.1             promises_1.2.0.1    
## [106] gridExtra_2.3        codetools_0.2-18     colourpicker_1.1.1   gtools_3.9.2         assertthat_0.2.1    
## [111] withr_2.5.0          shinystan_2.6.0      mnormt_2.0.2         parallel_4.1.3       hms_1.1.1           
## [116] grid_4.1.3           coda_0.19-4          rmarkdown_2.13       shiny_1.7.1          lubridate_1.8.0     
## [121] base64enc_0.1-3      dygraphs_1.1.1.6